
 k-nearest neighbour


Conformal Prediction for Multimodal Regression

Bose, Alexis, Ethier, Jonathan, Guinand, Paul

arXiv.org Artificial Intelligence

This paper introduces multimodal conformal regression. Traditionally confined to scenarios with solely numerical input features, conformal prediction is now extended to multimodal contexts through our methodology, which harnesses internal features from complex neural network architectures processing images and unstructured text. Our findings highlight the potential for internal neural network features, extracted from convergence points where multimodal information is combined, to be used by conformal prediction to construct prediction intervals (PIs). This capability paves new paths for deploying conformal prediction in domains abundant with multimodal data, enabling a broader range of problems to benefit from guaranteed distribution-free uncertainty quantification.
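The backbone of the approach above is standard split conformal regression, applied to features drawn from inside the network. As an illustrative sketch (not the paper's full multimodal pipeline), the interval construction on held-out calibration residuals looks like this; all names here are assumptions for the example:

```python
import numpy as np

def split_conformal_intervals(resid_cal, y_pred_test, alpha=0.1):
    """Split conformal regression: calibrate on held-out absolute
    residuals, then pad each test prediction by the finite-sample
    corrected (1 - alpha) empirical quantile, giving intervals with
    1 - alpha marginal coverage, distribution-free."""
    n = len(resid_cal)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(np.abs(resid_cal), level)
    return y_pred_test - q, y_pred_test + q

# Toy usage: residuals from a calibration split, intervals for two predictions
rng = np.random.default_rng(0)
resid = rng.normal(0.0, 1.0, 500)
lo, hi = split_conformal_intervals(resid, np.array([2.0, 5.0]), alpha=0.1)
```

In the multimodal setting described in the paper, the calibration features would come from the fusion point of the network rather than raw numerical inputs, but the interval mechanics are unchanged.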


Target Strangeness: A Novel Conformal Prediction Difficulty Estimator

Bose, Alexis, Ethier, Jonathan, Guinand, Paul

arXiv.org Artificial Intelligence

This paper introduces Target Strangeness, a novel difficulty estimator for conformal prediction (CP) that offers an alternative approach for normalizing prediction intervals (PIs). By assessing how atypical a prediction is within the context of its nearest neighbours' target distribution, Target Strangeness can surpass the current state-of-the-art performance. This novel difficulty estimator is evaluated against others in the context of several conformal regression experiments.
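The abstract does not give the estimator's exact formula, but its stated idea, scoring how atypical a prediction is against the target distribution of the input's nearest neighbours, can be sketched roughly as follows; this is an illustrative stand-in, not the paper's definition:

```python
import numpy as np

def knn_target_difficulty(X_cal, y_cal, x, y_hat, k=5):
    """Illustrative difficulty score: how atypical the prediction y_hat
    looks relative to the target values of x's k nearest calibration
    neighbours. In normalized conformal prediction, a larger score
    widens the prediction interval at that point."""
    dists = np.linalg.norm(X_cal - x, axis=1)
    nearest = np.argsort(dists)[:k]
    mu, sd = y_cal[nearest].mean(), y_cal[nearest].std()
    return abs(y_hat - mu) / (sd + 1e-8) + 1e-3  # strictly positive scale

# A prediction far from the neighbours' targets scores as "stranger"
rng = np.random.default_rng(1)
X_cal = rng.normal(size=(200, 3))
y_cal = X_cal.sum(axis=1) + rng.normal(0, 0.1, 200)
x_new = np.zeros(3)
sigma_typical = knn_target_difficulty(X_cal, y_cal, x_new, y_hat=0.0)
sigma_strange = knn_target_difficulty(X_cal, y_cal, x_new, y_hat=5.0)
```

In normalized conformal regression, residuals are divided by such a score during calibration and the quantile is multiplied back at test time, producing locally adaptive intervals.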


Application of Machine Learning Algorithms in Classifying Postoperative Success in Metabolic Bariatric Surgery: A Comprehensive Study

Benítez-Andrades, José Alberto, Prada-García, Camino, García-Fernández, Rubén, Ballesteros-Pomar, María D., González-Alonso, María-Inmaculada, Serrano-García, Antonio

arXiv.org Artificial Intelligence

Objectives: Metabolic Bariatric Surgery (MBS) is a critical intervention for patients living with obesity and related health issues. Accurate classification and prediction of patient outcomes are vital for optimizing treatment strategies. This study presents a novel machine learning approach to classify patients in the context of metabolic bariatric surgery, providing insights into the efficacy of different models and variable types. Methods: Various machine learning models, including GaussianNB, ComplementNB, KNN, Decision Tree, KNN with RandomOverSampler, and KNN with SMOTE, were applied to a dataset of 73 patients. The dataset, comprising psychometric, socioeconomic, and analytical variables, was analyzed to determine the most efficient predictive model. The study also explored the impact of different variable groupings and oversampling techniques. Results: Experimental results indicate average accuracy values as high as 66.7% for the best model. Enhanced versions of KNN and Decision Tree, along with variations of KNN such as RandomOverSampler and SMOTE, yielded the best results. Conclusions: The study unveils a promising avenue for classifying patients in the realm of metabolic bariatric surgery. The results underscore the importance of selecting appropriate variables and employing diverse approaches to achieve optimal performance. The developed system holds potential as a tool to assist healthcare professionals in decision-making, thereby enhancing metabolic bariatric surgery outcomes. These findings lay the groundwork for future collaboration between hospitals and healthcare entities to improve patient care through the utilization of machine learning algorithms. Moreover, the findings suggest room for improvement, potentially achievable with a larger dataset and careful parameter tuning.
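With only 73 patients, the oversampling step the authors pair with KNN matters. A minimal sketch of the idea behind random oversampling (resampling minority classes with replacement until class counts match), written in plain NumPy rather than any specific library:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Balance classes by resampling each minority class with
    replacement until every class count matches the majority count;
    this is the core idea behind the RandomOverSampler step used
    before fitting KNN."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    kept = []
    for c in classes:
        idx = np.where(y == c)[0]
        extra = rng.choice(idx, n_max - len(idx), replace=True)
        kept.append(np.concatenate([idx, extra]))
    keep = np.concatenate(kept)
    return X[keep], y[keep]

X = np.arange(20).reshape(10, 2)
y = np.array([0] * 8 + [1] * 2)          # imbalanced 8 : 2
X_bal, y_bal = random_oversample(X, y)   # balanced 8 : 8
```

SMOTE differs in that it interpolates new minority points between nearest neighbours rather than duplicating existing ones.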


Time-Series Forecasting: Unleashing Long-Term Dependencies with Fractionally Differenced Data

Maitra, Sarit, Mishra, Vivek, Dwivedi, Srashti, Kundu, Sukanya, Kundu, Goutam Kumar

arXiv.org Artificial Intelligence

This study introduces a novel forecasting strategy that leverages the power of fractional differencing (FD) to capture both short- and long-term dependencies in time series data. Unlike traditional integer differencing methods, FD preserves memory in the series while stabilizing it for modeling purposes. By applying FD to financial data from the SPY index and incorporating sentiment analysis from news reports, this empirical analysis explores the effectiveness of FD in conjunction with binary classification of target variables. Supervised classification algorithms were employed to validate the performance of FD series. The results demonstrate the superiority of FD over integer differencing, as confirmed by Receiver Operating Characteristic Area Under the Curve (ROC-AUC) and Matthews Correlation Coefficient (MCC) evaluations.
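Fractional differencing generalizes the first difference (1 - B) to a non-integer power (1 - B)^d via a binomial expansion, so a fractional order 0 < d < 1 retains long memory while still damping the trend. A minimal fixed-window sketch (the truncation scheme and window length are illustrative choices, not necessarily the paper's):

```python
import numpy as np

def frac_diff_weights(d, n_weights):
    """Binomial-expansion weights of (1 - B)^d:
    w_0 = 1 and w_k = -w_{k-1} * (d - k + 1) / k."""
    w = [1.0]
    for k in range(1, n_weights):
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

def frac_diff(series, d, window=20):
    """Fixed-window fractional differencing: each output value is a
    weighted sum of the most recent `window` observations."""
    w = frac_diff_weights(d, window)[::-1]          # oldest weight first
    x = np.asarray(series, dtype=float)
    return np.array([w @ x[t - window + 1:t + 1]
                     for t in range(window - 1, len(x))])

x = np.cumsum(np.ones(50))                          # simple trending series
fd = frac_diff(x, d=0.4, window=10)
```

As a sanity check, setting d = 1 with a window of 2 recovers the ordinary first difference, while d = 0 leaves the series unchanged.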


Categorising Products in an Online Marketplace: An Ensemble Approach

Drumm, Kieron

arXiv.org Artificial Intelligence

In recent years, product categorisation has been a common challenge for e-commerce companies, which have turned to machine learning to categorise their products automatically. In this study, we propose an ensemble approach that uses a combination of models to separately predict each product's category, subcategory, and colour before combining the resultant predictions for each product. With this approach, we show that an average F1-score of 0.82 can be achieved using a combination of XGBoost and k-nearest neighbours to predict said features.
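The structure of that ensemble, one independent model per attribute whose outputs are merged into a single record per product, can be sketched as follows. `MajorityModel` is a hypothetical placeholder; in the paper's setup the per-attribute models would be trained XGBoost or k-NN classifiers:

```python
class MajorityModel:
    """Hypothetical stand-in for a trained classifier; any object with
    a .predict(items) method slots into the ensemble below."""
    def __init__(self, label):
        self.label = label

    def predict(self, items):
        return [self.label for _ in items]

def categorise(products, models):
    """Run one model per attribute, then merge the per-attribute
    predictions into a single record per product."""
    preds = {attr: m.predict(products) for attr, m in models.items()}
    return [{attr: preds[attr][i] for attr in models}
            for i in range(len(products))]

models = {"category": MajorityModel("clothing"),
          "subcategory": MajorityModel("t-shirt"),
          "colour": MajorityModel("blue")}
records = categorise(["red cotton tee", "navy polo"], models)
```

Keeping the attribute predictors independent lets each be tuned and evaluated (e.g. per-attribute F1) separately before the results are combined.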


Generate synthetic samples from tabular data

Banh, David, Huang, Alan

arXiv.org Artificial Intelligence

Generating new samples from existing data sets can reduce the need for costly additional data collection or invasive procedures, and can mitigate privacy issues. Such statistically robust synthetic samples can serve as a temporary, intermediate replacement when privacy is a concern. This method can enable better data-sharing practices without the identification risks or biases that leave data open to adversarial attack.
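The abstract does not specify the generation method, but the simplest density-based baseline for numeric tabular data, fitting a multivariate Gaussian and sampling from it, illustrates the idea; this is a sketch of the concept, not the paper's technique:

```python
import numpy as np

def synth_gaussian(X, n, seed=0):
    """Illustrative baseline generator: fit a multivariate Gaussian to
    the numeric table X and sample n synthetic rows that match its mean
    and covariance (but contain no original record)."""
    rng = np.random.default_rng(seed)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    return rng.multivariate_normal(mu, cov, size=n)

real = np.random.default_rng(1).normal(size=(300, 4))
fake = synth_gaussian(real, n=100)   # same schema, new rows
```

Because each synthetic row is drawn from the fitted distribution rather than copied, no row corresponds to a real individual, which is what makes such samples usable as a privacy-preserving stand-in.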


K-Nearest Neighbours - GeeksforGeeks

#artificialintelligence

K-Nearest Neighbours is one of the most basic yet essential classification algorithms in Machine Learning. It belongs to the supervised learning domain and finds intense application in pattern recognition, data mining, and intrusion detection. It is widely applicable in real-life scenarios because it is non-parametric, meaning it makes no underlying assumptions about the distribution of the data (as opposed to algorithms such as GMM, which assume the given data follows a Gaussian distribution). We are given some prior data (also called training data), which classifies coordinates into groups identified by an attribute. Given another set of data points (also called testing data), the task is to assign each of these points to a group by analysing the training set.
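The procedure described above, assigning each test point the majority label among its k nearest training points, fits in a few lines; a minimal sketch using Euclidean distance:

```python
from collections import Counter
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each test point by majority vote among its k nearest
    training points under Euclidean distance."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)   # distance to all training points
        nearest = np.argsort(dists)[:k]               # indices of the k closest
        preds.append(Counter(y_train[nearest]).most_common(1)[0][0])
    return np.array(preds)

# Two well-separated groups; each query point lands in the nearer group
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array(["a", "a", "a", "b", "b", "b"])
labels = knn_predict(X_train, y_train, np.array([[0.5, 0.5], [5.5, 5.5]]))
```

Note that there is no training step at all: the "model" is simply the stored training set, which is what makes k-NN non-parametric.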


Does predict function work in parallel when predicting k-nearest neighbour?

#artificialintelligence

I have a k-nearest neighbour classifier which I trained with fitcknn. When predicting labels with the model using predict, does it run in parallel? I have tested predict in both a for loop and a parfor loop. The simple for loop performs slightly faster, which makes me think the predict function benefits from some built-in optimisation or parallelisation. However, the documentation makes no reference to this, and I thought MATLAB always runs single-threaded unless a parallel pool is explicitly in use?


Application of Computer Vision : Object Classification

#artificialintelligence

Object classification from photographic images is a complex process and is fast becoming an important task in the field of computer vision. Real-time object classification from images has been used in fields such as healthcare, manufacturing, and retail. The technique involves labelling images and assigning them to predefined classes based on the features or objects observed, with the goal of unambiguously identifying each feature or object in the image. The field draws on a range of techniques and algorithms to acquire, analyse, and process images. In general, an object classification algorithm takes in a set of features that represent the objects in the image and uses them to predict the class of each object.


A Beginners Guide to Deep Metric Learning

#artificialintelligence

Learning the similarity between objects has a dominant role in human cognitive processes and artificial systems for recognition and classification. Using an appropriate distance metric, the metric learning attempts to quantify sample similarity while conducting learning tasks. Metric learning techniques, which typically use a linear projection, are limited in their capacity to tackle non-linear real-world scenarios. Kernel approaches are employed in metric learning to overcome this problem. In this post, we will understand what metric learning and deep metric learning are and how deep metric learning can address the challenges faced by metric learning.